Punctuation has a point, so use it!

Authors

  • Robin L. Hill
  • Wayne S. Murray
Abstract

It is all too common for systems processing natural language, whether for input (automatic speech recognition, text queries, dialogue etc.) or output (text-to-speech), to ignore or strip out punctuation. The effect of prosodic factors, such as intonation and pausing, on language processing remains controversial. While there is an obvious relationship between punctuation and prosody it cannot be a simple mapping: grammatical rules prevent the inclusion of punctuation at points where a speaker might pause, and the set of punctuation is not rich enough to transcribe all the spoken features categorised as prosody. It is therefore important for any realistic text-to-speech (or speech-to-text) conversion to consider these important features of language. An experimental investigation showed that commas exert a consistently strong and direct rhetorical influence on sentences being read aloud. They result in the slower delivery of words preceding the comma and an increase in pauses in speech. While the lengthening effect is an uncontroversial feature found at the end of clauses, even in the absence of punctuation, there is evidence to suggest that the comma is particularly useful in acoustically segmenting text by stimulating a gap, or period of silence, between linguistic units. This is particularly salient at points where a break can convey disambiguating information. Somewhat surprisingly, commas do not induce shifts in the fundamental frequency of speech or alter intonational patterns. Any generation of naturalistic synthetic speech should therefore take these factors into consideration.

BODY: Too many TTS and ASR systems ignore punctuation. Mistake! Little things can make a difference for the better.

Volume 2 of Tiny Transactions on Computer Science. This content is released under the Creative Commons Attribution-NonCommercial ShareAlike License.
Permission to make digital or hard copies of all or part of this work is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. CC BY-NC-SA 3.0: http://creativecommons.org/licenses/by-nc-sa/3.0/.
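
As a rough illustration of the abstract's findings for text-to-speech, the sketch below turns each comma into two cues: a slower delivery of the word before the comma and a pause after it, with no pitch change. SSML is assumed as the output markup (the source names none), and the function name `commas_to_ssml`, the 90% rate and the 250 ms pause are hypothetical placeholders, since the abstract reports the direction of the effects but not their size.

```python
from xml.sax.saxutils import escape

# Hypothetical magnitudes: the abstract gives no numbers for either effect.
PRE_COMMA_RATE = "90%"   # assumed slowdown for the word preceding a comma
COMMA_PAUSE = "250ms"    # assumed silence inserted at a comma


def commas_to_ssml(text: str) -> str:
    """Wrap pre-comma words in a slower prosody rate and add a break.

    Pitch is deliberately left untouched, mirroring the reported finding
    that commas do not shift fundamental frequency.
    """
    parts = []
    for token in text.split():
        if token.endswith(",") and len(token) > 1:
            word = escape(token[:-1])  # escape &, <, > for XML safety
            parts.append(
                f'<prosody rate="{PRE_COMMA_RATE}">{word}</prosody>,'
                f'<break time="{COMMA_PAUSE}"/>'
            )
        else:
            parts.append(escape(token))
    return "<speak>" + " ".join(parts) + "</speak>"


if __name__ == "__main__":
    print(commas_to_ssml("While she ate, the dog barked."))
```

A real system would also need to handle commas that are not followed by whitespace and clause boundaries that carry no punctuation at all, which this sketch ignores.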


Similar resources

Exploring The Role Of Punctuation In Parsing Natural Text

Few, if any, current NLP systems make any significant use of punctuation. Intuitively, a treatment of punctuation seems necessary to the analysis and production of text. Whilst this has been suggested in the fields of discourse structure, it is still unclear whether punctuation can help in the syntactic field. This investigation attempts to answer this question by parsing some corpus-based ma...


Commas and Spaces: The Point of Punctuation

While it has been widely assumed that punctuation may play a critical role in parsing, there has been relatively little direct empirical investigation of its effects. Most researchers have either avoided the use of punctuation or have simply assumed that it will serve a disambiguating role. There has been little or no consideration of how 'disambiguation' might occur or whether it is equally ef...


Punctuated Parsing: Signposts Along the Garden-Path

Although there has been some speculation concerning the role played by punctuation in parsing, there has been amazingly little empirical investigation of the issue. Punctuation appears to be a widely neglected topic. For the most part, where punctuation has been included in parsing studies, investigators have simply assumed that punctuation, such as commas, can be used to effectively disambigua...


Punctuating confusion networks for speech translation

Translating from confusion networks (CNs) has been proven to be more effective than translating from single best hypotheses. Moreover, it is widely accepted that the availability of good punctuation marks in the input can improve translation quality. At present, no ASR systems can generate punctuation marks in the word graphs, therefore CNs miss punctuation. In this paper we investigate the pro...


Incorporating Punctuation Into the Sentence Grammar: A Lexicalized Tree Adjoining Grammar Perspective

Punctuation helps us to structure, and thus to understand, texts. Many uses of punctuation straddle the line between syntax and discourse, because they serve to combine multiple propositions within a single orthographic sentence. They allow us to insert discourse-level relations at the level of a single sentence. Just as people make use of information from punctuation in processing what they re...



Journal title:
  • TinyToCS

Volume 2

Publication date: 2013